Provable training set debugging for linear regression
Authors
Abstract
We investigate problems in penalized M-estimation, inspired by applications in machine learning debugging. Data are collected from two pools, one containing data with possibly contaminated labels, and the other known to contain only cleanly labeled points. We first formulate a general statistical algorithm for identifying buggy points and provide rigorous theoretical guarantees when the data follow a linear model. We then propose an algorithm for tuning parameter selection of our Lasso-based algorithm, with theoretical guarantees. Finally, we consider a two-person “game” played between a bug generator and a debugger, where the debugger can augment the contaminated data set with cleaned versions of points in the original data pool. We develop and analyze a debugging strategy in terms of a Mixed Integer Linear Programming (MILP) problem, and we provide empirical results to verify the utility of the MILP strategy.
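As a rough illustration of what Lasso-based identification of buggy points can look like under a linear model, the sketch below assumes a mean-shift formulation: each possibly contaminated label is modeled as y_i = x_i^T beta + gamma_i + noise, with an l1 penalty on gamma so that only a few points receive a nonzero shift, while the clean pool enters as an ordinary least-squares term. The objective, the weight `eta`, the alternating-minimization solver, and the names `debug_lasso` and `soft_threshold` are all assumptions made for this example; the abstract does not specify them, and this should not be read as the authors' exact algorithm.

```python
import numpy as np


def soft_threshold(r, t):
    """Elementwise soft-thresholding: shrink r toward zero by t."""
    return np.sign(r) * np.maximum(np.abs(r) - t, 0.0)


def debug_lasso(X, y, X_clean, y_clean, lam, eta=1.0, n_iter=200):
    """Alternating minimization of an assumed objective (illustrative only):

        (1/2n) ||y - X beta - gamma||^2
        + (eta/2m) ||y_clean - X_clean beta||^2
        + lam * ||gamma||_1

    Points with gamma_i != 0 are flagged as potentially buggy.
    """
    n, m = X.shape[0], X_clean.shape[0]
    gamma = np.zeros(n)
    # Normal-equation matrix for the beta step; it does not depend on gamma.
    A = X.T @ X / n + eta * (X_clean.T @ X_clean) / m
    b_clean = eta * (X_clean.T @ y_clean) / m
    for _ in range(n_iter):
        # beta step: weighted least squares on both pools, with the
        # current shift estimates subtracted from the suspect labels.
        beta = np.linalg.solve(A, X.T @ (y - gamma) / n + b_clean)
        # gamma step: closed-form soft-thresholding of the residuals.
        gamma = soft_threshold(y - X @ beta, n * lam)
    return beta, gamma
```

Under these assumptions, calling `beta, gamma = debug_lasso(X, y, X_clean, y_clean, lam)` and inspecting `np.flatnonzero(gamma)` gives the indices flagged as suspect. The choice of `lam` controls how large a residual must be before a point is flagged, which is why a principled tuning procedure matters.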
Similar resources
Training Set Debugging Using Trusted Items
Training set bugs are flaws in the data that adversely affect machine learning. The training set is usually too large for manual inspection, but one may have the resources to verify a few trusted items. The set of trusted items may not by itself be adequate for learning, so we propose an algorithm that uses these items to identify bugs in the training set and thus improves learning. Specificall...
Fast Active-set-type Algorithms for L1-regularized Linear Regression
In this paper, we investigate new active-set-type methods for l1-regularized linear regression that overcome some difficulties of existing active set methods. By showing a relationship between l1-regularized linear regression and the linear complementarity problem with bounds, we present a fast active-set-type method, called block principal pivoting. This method accelerates computation by allowi...
Interval linear regression
In this paper, we have studied the analysis of an interval linear regression model for fuzzy data. In section one, we have introduced the concepts required in this thesis and then we illustrated linear regression fuzzy sets and some primary definitions. In section two, we have introduced various methods of interval linear regression analysis. In section three, we have implemented nu...
Debugging Inconsistent Answer Set Programs
In this paper we examine how we can find contradictions from Answer Set Programs (ASP). One of the most important phases of programming is debugging, finding errors that have crept in during program implementation. Current ASP systems are still mostly experimental tools and their support for debugging is limited. This paper addresses one part of ASP debugging, finding the reason why a program d...
Journal
Journal title: Machine Learning
Year: 2021
ISSN: 0885-6125, 1573-0565
DOI: https://doi.org/10.1007/s10994-021-06040-4